ECPR

Panel Data Analysis

Course Dates and Times

Monday 17 – Friday 21 February 2019, 14:00 – 17:30 (finishing slightly earlier on Friday)
15 hours over five days

Andrew X. Li

lixiang577@gmail.com

Central European University

This course builds on ordinary least squares (OLS) regression and extends it to data with a time-series-cross-sectional structure.

It places an emphasis on the connections between panel data methods and causal inference, which is the primary goal of social science research.

Panel data is particularly advantageous for causal inference because it allows researchers to control for entity-specific factors (individual heterogeneity) such as geography and culture, which may not be observed and can be difficult to measure.

Furthermore, panel data usually contains more degrees of freedom and more sample variability than purely cross-sectional or time-series data, and hence improves the efficiency of the parameter estimates.

The course begins with a quick review of OLS regression and its key assumptions.

It then moves on to simple panel data methods, fixed and random effects estimators and more advanced methods such as clustered samples, panel instrumental variable methods, panel-corrected standard error estimators and dynamic panel regressions, depending on the time available. Two or three lab sessions put these methods into practice.

The last session is a seminar in which those who want to earn extra credits can present their research or research proposals and receive feedback from the Instructor and their peers.

Please do bring along your own data. 

Tasks for ECTS credits

2 credits Attend 90% of class hours and participate actively in class.

3 credits As above, plus give a presentation of your research or research proposals during the final seminar session.

4 credits As above, plus submit a research paper / proposal of 2,500–3,000 words that uses panel data analysis in the research design. 


Instructor Bio

Andrew is an assistant professor at CEU's Department of International Relations. He obtained his PhD from the National University of Singapore and King’s College London.

His research interests include international political economy, research design, and quantitative methods. He teaches the Research Design and Methods in IR course series at CEU.

@lixiang577

The course consists of three parts:

  1. We discuss the logic and assumptions underlying panel data methods. You will learn how the development of more advanced methods is driven by the need to address potential violations of these assumptions.
  2. We focus on the various statistical approaches and 'tricks' available to social scientists to deal with such violations and problems hidden in their data, allowing them to obtain estimates of effects that are as close as possible to the true causal effects.
  3. We focus on applying the wide range of panel data methods discussed in the previous parts to substantive research questions of interest. You will learn how these methods can be used to provide answers to your own research questions.

Day 1
We begin with a quick review of OLS regression and the set of assumptions required for the OLS estimator to be the best linear unbiased estimator (BLUE). We will draw connections between regression analysis and causal inference in the context of observational studies. In the second half of the day, we will look at simple panel data methods. Specifically, we make a distinction between 'independently pooled cross section' data and 'panel' data, the latter being the focus of this course. We shall see that panel data differs in some important respects from an independently pooled cross section in that a panel consists of the same entities (individuals, firms, countries and so on) observed across time. We will then study two basic panel data methods, namely two-period panel data analysis and first differencing.
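To illustrate why first differencing helps, here is a minimal simulation sketch. It is written in Python with NumPy purely for illustration (the course itself uses Stata and/or R), and the data-generating process, the variable names and the helper `ols_slope` are assumptions made for this example rather than course material. The unobserved entity effect `a` is correlated with the regressor, so pooled OLS is biased, while differencing the two periods removes `a`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods, beta = 500, 2, 2.0

# Unobserved, time-constant entity effect (e.g. geography or culture).
a = rng.normal(size=n_entities)

# The regressor is correlated with the unobserved effect,
# so pooled OLS that ignores `a` is biased.
x = a[:, None] + rng.normal(size=(n_entities, n_periods))
y = beta * x + a[:, None] + rng.normal(size=(n_entities, n_periods))


def ols_slope(x, y):
    """Slope from a bivariate OLS regression that includes an intercept."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float((x_c * y_c).sum() / (x_c ** 2).sum())


# Pooled OLS: stack both periods and ignore the panel structure.
pooled = ols_slope(x.ravel(), y.ravel())

# First differencing: period 2 minus period 1 removes the entity effect.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
first_diff = ols_slope(dx, dy)

print(f"true: {beta}, pooled OLS: {pooled:.2f}, first difference: {first_diff:.2f}")
```

In this simulated setting the pooled estimate is pulled away from the true slope, whereas the first-difference estimate should land close to it.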

Day 2
We focus on slightly more advanced methods for estimating the unobserved effects in the context of panel data analysis. We introduce the fixed effects estimator, which, like first differencing, uses a transformation to remove the unobserved effect prior to estimation. As a result, any time-constant explanatory variables are removed in the process. In contrast, we introduce the random effects estimator, which looks attractive when we think the unobserved effect is uncorrelated with all the explanatory variables. For example, when we have good knowledge about the factors affecting the dependent variable and have controlled for these factors in the equation, the random effects estimator can sometimes be the preferred strategy. With these foundations, we will then look into a relatively new correlated random effects approach, which provides a synthesis of fixed effects and random effects methods and has been shown to be very useful. You will see that the usefulness of these methods critically hinges on the assumptions made with respect to the error term and the relationship between the regressors and the error term.
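The fixed effects transformation can be sketched in a few lines. The example below is an illustration in Python with NumPy (the course itself uses Stata and/or R), with a simulated data-generating process and a hypothetical helper `entity_means` that are assumptions of this sketch. It computes the within estimator by demeaning each variable by entity, checks it against a least squares dummy variable regression, and then runs a simplified correlated random effects regression that adds the entity mean of the regressor. That last step uses pooled OLS for simplicity, whereas the approach is usually estimated by random effects GLS.

```python
import numpy as np

rng = np.random.default_rng(1)
n_entities, n_periods, beta = 200, 5, 1.5

entity = np.repeat(np.arange(n_entities), n_periods)  # entity id for each row
a = rng.normal(size=n_entities)                        # unobserved entity effect
x = a[entity] + rng.normal(size=entity.size)           # regressor correlated with a
y = beta * x + a[entity] + rng.normal(size=entity.size)


def entity_means(v, entity, n_entities):
    """Mean of v within each entity, broadcast back to the rows."""
    sums = np.bincount(entity, weights=v, minlength=n_entities)
    counts = np.bincount(entity, minlength=n_entities)
    return (sums / counts)[entity]


# Within (fixed effects) estimator: demean by entity, then OLS on the
# demeaned data. Time-constant variables would be wiped out by this step.
x_w = x - entity_means(x, entity, n_entities)
y_w = y - entity_means(y, entity, n_entities)
beta_within = float(x_w @ y_w / (x_w @ x_w))

# Least squares dummy variables: x plus one dummy per entity.
dummies = (entity[:, None] == np.arange(n_entities)).astype(float)
coef_lsdv, *_ = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)
beta_lsdv = float(coef_lsdv[0])

# Simplified correlated random effects regression: add the entity mean
# of x as an extra regressor and estimate by pooled OLS.
x_bar = entity_means(x, entity, n_entities)
design_cre = np.column_stack([np.ones_like(x), x, x_bar])
coef_cre, *_ = np.linalg.lstsq(design_cre, y, rcond=None)
beta_cre = float(coef_cre[1])

print(f"within: {beta_within:.4f}, LSDV: {beta_lsdv:.4f}, CRE: {beta_cre:.4f}")
```

The three slope estimates coincide up to numerical precision, which is one way to see how the correlated random effects approach connects fixed and random effects thinking.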

Day 3
We begin with a lab session, during which we put into practice the methods introduced over the previous two days. We carry out these analyses in Stata and/or R and learn about the interpretation of the results. This is also a good opportunity for students who have brought their own data to carry out the analysis for their current research projects. In the second session, we come back to the classroom to learn more about the research designs and methods to which panel data can be applied. From this point on, we start to relax the assumptions made in the previous discussions and learn methods designed to deal with various violations of these assumptions. The first is the instrumental variable (IV) method, which deals with violations of the strict exogeneity assumption. This is a good opportunity for students to see the connection between panel data methods and causal inference. You will learn how an instrumental variable in regression analysis plays the role of randomisation in experiments.
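The logic of the IV method can be sketched with a simple simulation. The example below uses Python with NumPy for illustration only (the course itself uses Stata and/or R) and a cross-sectional rather than panel setting to keep it short. The data-generating process, the coefficient values and the helper `cov` are assumptions of this sketch. The regressor `x` is correlated with the error through `u`, while the instrument `z` shifts `x` but is unrelated to `u`.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 5000, 1.0

u = rng.normal(size=n)                # unobserved factor in the error term
z = rng.normal(size=n)                # instrument: moves x, unrelated to u
x = 0.8 * z + u + rng.normal(size=n)  # endogenous regressor
y = beta * x + u + rng.normal(size=n)


def cov(a, b):
    """Sample covariance of two one-dimensional arrays."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


# OLS slope: biased because x is correlated with the error through u.
beta_ols = cov(x, y) / cov(x, x)

# IV slope with a single instrument: only the variation in x that is
# driven by z is used to identify the effect.
beta_iv = cov(z, y) / cov(z, x)

print(f"true: {beta}, OLS: {beta_ols:.2f}, IV: {beta_iv:.2f}")
```

With one instrument and one endogenous regressor, this ratio of covariances gives the same slope as two-stage least squares.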

Day 4
We introduce several more advanced panel data methods that address further violations of the standard OLS assumptions. Since panel data contains repeated observations of the same entity over time, we may have different error variances for different panels as well as correlation of the error terms across panels and/or time. To deal with these potential challenges, we study methods such as clustering and robust estimation, panel-corrected standard error (PCSE) estimates and dynamic panel methods (Arellano-Bond and system GMM estimators). For these advanced methods, I will introduce the matrix notation of regression to help students visualise what each of these methods does to the variance-covariance matrix.
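As a hedged sketch of the matrix notation referred to above, the stacked model can be written as y = Xβ + u, with Ω denoting the covariance matrix of the errors. The symbols below are assumptions of this note, not taken from the course materials. X_i and û_i denote the rows of X and the OLS residuals belonging to entity i, and the usual small-sample corrections are omitted.

```latex
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'y, \\
\operatorname{Var}(\hat{\beta}\mid X) &= (X'X)^{-1}\,X'\Omega X\,(X'X)^{-1},
  \qquad \Omega = \operatorname{E}[uu'\mid X], \\
\Omega = \sigma^{2}I \;&\Rightarrow\; \operatorname{Var}(\hat{\beta}\mid X) = \sigma^{2}(X'X)^{-1}, \\
\widehat{\operatorname{Var}}_{\text{cluster}}(\hat{\beta}) &=
  (X'X)^{-1}\Big(\sum_{i=1}^{N} X_i'\hat{u}_i\hat{u}_i'X_i\Big)(X'X)^{-1}.
\end{aligned}
```

Read this way, the classical OLS formula is the special case in which Ω is proportional to the identity matrix, and the cluster-robust estimator replaces the middle term with a sum over entities that allows arbitrary error correlation within each entity.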

Day 5
In the first session we return to the lab and put these more advanced methods into practice. We learn how to carry out panel IV analysis, obtain the PCSE, Arellano-Bond and system GMM estimators and calculate cluster robust standard errors. Again, this is a good opportunity for students to carry out the analyses using their own datasets and check the robustness of the results across various model specifications. The last session is a seminar. Students who want extra credits will present their research or research proposals that use panel data methods and receive feedback from the Instructor and fellow participants.


This course is a general survey of panel data methods. The limited time available means we won’t have the luxury of going deeply into methodological niches, such as presenting all the techniques for tackling autocorrelation. The course aims to strike a balance between econometric theory and practical implementation. If you are interested only in implementing panel data methods, you may find this course less beneficial.

You should be familiar with basic statistical concepts such as sample mean and sample variance as well as their properties.

You should also have basic knowledge of regression analysis (up to multiple regression), and basic skills in Stata and/or R.

If you don't, we strongly encourage you to take other relevant Winter School courses, either before or concurrently with this course.

Day | Topic
1 | Panel data and causality, describing and modelling panel data
2 | Pooled OLS, fixed and random effects
3 | Growth curve models
4 | Age-period-cohort models
5 | Hybrid models, dynamic panel models, miscellaneous topics

Day | Topic | Details
1 | Review of OLS regression and causal inference; simple panel data methods | 1.5 hour lecture; 1.5 hour lecture
2 | Fixed effects estimator; random effects estimator; correlated random effects approach | 3 hour lecture
3 | Practical session; instrumental variable methods | 1.5 hour lab; 1.5 hour lecture
4 | Regression in matrix form; clustering and robust estimation; panel-corrected standard error; dynamic panel models | 3 hour lecture
5 | Practical session; student presentations | 1.5 hour lab; 1.5 hour seminar

Day Readings
1

Brady, Henry E. (2010) Causation and Explanation in Social Science. In: J.M. Box-Steffensmeier, H.E. Brady and D. Collier (eds.) The Oxford Handbook of Political Methodology. Oxford University Press, pp. 217-270.

Lynn, Peter (2009) Methods for Longitudinal Data. In: Peter Lynn (ed.) Methodology of Longitudinal Surveys. Chichester: Wiley, pp. 1-20.

2

Cameron, A. Colin & Trivedi, Pravin K. (2005) Microeconometrics: Methods and Applications. Cambridge University Press, pp. 254-285.

Wooldridge, Jeffrey M. (2003) Introductory Econometrics. South-Western College Pub, pp. 426-469.

3

Singer, Judith D. & Willett, John B. (2003) Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford University Press, pp. 45-75.

Cameron, A. Colin & Trivedi, Pravin K. (2005) Microeconometrics: Methods and Applications. Cambridge University Press, pp. 305-313.

Rabe-Hesketh, Sophia & Skrondal, Anders (2005) Multilevel and Longitudinal Modeling Using Stata, 1st edition, pp. 57-74.

4

O’Brien, Robert M. (2014) Age-Period-Cohort Models. Chapman & Hall/CRC.

5

Wooldridge, Jeffrey M. (2010) Econometric Analysis of Cross Section and Panel Data. MIT Press.

Hsiao, Cheng (2003) Analysis of Panel Data. Cambridge University Press.

1

Rubin, Donald B
Estimating causal effects of treatments in randomized and nonrandomized studies
Journal of Educational Psychology 66, no. 5 (1974): 688–701

Wooldridge, Jeffrey M
Introductory Econometrics: A Modern Approach, Chapter 13, pp.402–433
Cengage Learning, 2016

2

Wooldridge, Jeffrey M
Introductory Econometrics: A Modern Approach, Chapter 14, pp.434–457
Cengage Learning, 2016

Beck, Nathaniel
Time-series–cross-section data: What have we learned in the past few years?
Annual Review of Political Science 4, no. 1 (2001): 271–293

3

Cameron, Adrian Colin and Pravin K. Trivedi
Microeconometrics Using Stata, Chapter 8, pp.229–280
College Station, TX: Stata Press, 2009

Wooldridge, Jeffrey M
Econometric Analysis of Cross Section and Panel Data, Chapter 11, pp.345–394
MIT Press, 2010

4

Beck, Nathaniel, and Jonathan N. Katz
What to do (and not to do) with time-series cross-section data
American Political Science Review 89, no. 3 (1995): 634–647

Arellano, Manuel, and Olympia Bover
Another Look at the Instrumental Variable Estimation of Error-components Models
Journal of Econometrics 68, no. 1 (1995): 29–51

Bond, Stephen R
Dynamic panel data models: a guide to micro data methods and practice
Portuguese Economic Journal 1, no. 2 (2002): 141–162

Wooldridge, Jeffrey M
Econometric Analysis of Cross Section and Panel Data, Chapter 20, pp.853–902
MIT Press, 2010

5

Cameron, Adrian Colin and Pravin K. Trivedi
Microeconometrics Using Stata, Chapter 9, pp.281–312
College Station, TX: Stata Press, 2010

Software Requirements

Stata version 16

R version 3.6.1 or above

Literature

Holland, Paul W
Statistics and Causal Inference
Journal of the American Statistical Association 81, no. 396 (1986): 945–960

Nickell, Stephen
Biases in dynamic models with fixed effects
Econometrica: Journal of the Econometric Society (1981): 1417–1426

Imbens, Guido W., and Jeffrey M. Wooldridge
Recent developments in the econometrics of program evaluation
Journal of Economic Literature 47, no. 1 (2009): 5–86

Blundell, Richard, Stephen Bond, and Frank Windmeijer
Estimation in dynamic panel data models: improving on the performance of the standard GMM estimator
In: Baltagi, Badi H. (ed.). Nonstationary panels, panel cointegration, and dynamic panels, pp. 53–91
Emerald Group Publishing Limited, 2001

Angrist, Joshua D., and Jörn-Steffen Pischke
Mostly Harmless Econometrics: An Empiricist's Companion
Princeton University Press, 2009

Recommended Courses to Cover Before this One

Summer School

Introduction to Stata
Introduction to Inferential Statistics: What you need to know before you take regression
Multiple Regression Analysis: Estimation, Diagnostics, and Modelling

Winter School

Introduction to Stata
Introduction to R (entry level)
Regression Refresher

Recommended Courses to Cover After this One

Summer School

Advanced Topics in Applied Regression
Causal Inference in the Social Sciences II: Difference in Difference, Regression Discontinuity and Instruments

Winter School

Multilevel Regression Modelling